A fast incremental extreme learning machine algorithm for data streams classification

Authors

  • Shuliang Xu
  • Junhong Wang
Abstract

Data stream classification is an important way to extract useful knowledge from massive, dynamic data. Because of concept drift, traditional data mining techniques cannot be applied directly in a data stream environment. The extreme learning machine (ELM) is a single hidden layer feedforward neural network (SLFN). Compared with traditional neural networks (e.g. the BP network), ELM trains faster, which makes it well suited to real-time data processing. To address the challenges of data stream classification, this paper proposes a new approach based on ELM. The approach uses ELMs as base classifiers and adaptively decides the number of neurons in the hidden layer; in addition, activation functions are selected at random from a set of functions to improve performance. The algorithm trains a series of classifiers, and decisions for unlabeled data are made by a weighted voting strategy. While the concept in the data stream remains stable, every classifier is updated incrementally with new data; if concept drift is detected, the classifiers with weak performance are discarded. In the experiments, 7 artificial data sets and 9 real data sets from the UCI repository were used to evaluate the proposed approach. The results showed that, compared with conventional classification methods for data streams such as ELM, BP, AUE2 and Learn++.MF, on most data sets the new approach not only has the simplest structure but also achieves higher and more stable accuracy with lower time consumption. © 2016 Published by Elsevier Ltd.
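The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `TinyELM`, the functions `weighted_vote` and `process_chunk`, the candidate activation set, the random choice of hidden-layer size (a stand-in for the adaptive selection), the recursive least-squares update, and the accuracy-threshold drift rule are all assumptions made here for illustration, since the abstract does not specify them.

```python
import numpy as np

# Candidate activations (assumed; the abstract does not list the actual set).
ACTIVATIONS = {
    "sigmoid": lambda z: 1.0 / (1.0 + np.exp(-z)),
    "tanh": np.tanh,
    "sine": np.sin,
}


class TinyELM:
    """Hypothetical ELM base classifier with an incremental update."""

    def __init__(self, n_inputs, n_hidden, n_classes, rng, ridge=1e-3):
        # Hidden-layer weights are random and never trained (the ELM idea).
        self.W = rng.normal(size=(n_inputs, n_hidden))
        self.b = rng.normal(size=n_hidden)
        # Activation picked at random from the candidate set.
        self.act = ACTIVATIONS[str(rng.choice(sorted(ACTIVATIONS)))]
        self.n_classes = n_classes
        # P tracks the inverse of (H^T H + ridge * I); beta are output weights.
        self.P = np.eye(n_hidden) / ridge
        self.beta = np.zeros((n_hidden, n_classes))

    def _hidden(self, X):
        return self.act(X @ self.W + self.b)

    def partial_fit(self, X, y):
        # Recursive least-squares update on one chunk (assumed update rule).
        H = self._hidden(X)
        T = np.eye(self.n_classes)[y]  # one-hot targets
        K = np.linalg.inv(np.eye(len(X)) + H @ self.P @ H.T)
        self.P = self.P - self.P @ H.T @ K @ H @ self.P
        self.beta = self.beta + self.P @ H.T @ (T - H @ self.beta)

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)


def weighted_vote(models, weights, X, n_classes):
    """Each model adds its weight to the class it predicts."""
    scores = np.zeros((len(X), n_classes))
    rows = np.arange(len(X))
    for model, w in zip(models, weights):
        scores[rows, model.predict(X)] += w
    return np.argmax(scores, axis=1)


def process_chunk(models, X, y, rng, n_classes, drift_threshold=0.7):
    """Handle one labelled chunk and return the new voting weights."""
    weights = [float(np.mean(m.predict(X) == y)) for m in models]
    ensemble_acc = np.mean(weighted_vote(models, weights, X, n_classes) == y)
    if ensemble_acc < drift_threshold:
        # Assumed drift rule: replace members scoring below the ensemble mean.
        cutoff = np.mean(weights)
        for i, w in enumerate(weights):
            if w < cutoff:
                n_hidden = int(rng.integers(10, 30))
                models[i] = TinyELM(X.shape[1], n_hidden, n_classes, rng)
    for m in models:
        m.partial_fit(X, y)  # incremental update with the new data
    return [float(np.mean(m.predict(X) == y)) for m in models]
```

In use, an ensemble would be created once, `process_chunk` would be called for each labelled chunk as it arrives, and `weighted_vote` would classify unlabeled data with the most recent weights. Using each member's accuracy on the latest chunk as its voting weight is one simple choice among many.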


Similar resources

Dynamic Cost-sensitive Ensemble Classification based on Extreme Learning Machine for Mining Imbalanced Massive Data Streams

In order to lower the classification cost and improve the performance of the classifier, this paper proposes the approach of the dynamic cost-sensitive ensemble classification based on extreme learning machine for imbalanced massive data streams (DCECIMDS). Firstly, this paper gives the method of concept drifts detection by extracting the attributive characters of imbalanced massive data stream...


Classification of encrypted traffic for applications based on statistical features

Traffic classification plays an important role in many aspects of network management such as identifying type of the transferred data, detection of malware applications, applying policies to restrict network accesses and so on. Basic methods in this field were using some obvious traffic features like port number and protocol type to classify the traffic type. However, recent changes in applicat...


A Hybrid Machine Learning Method for Intrusion Detection

Data security is an important area of concern for every computer system owner. An intrusion detection system is a device or software application that monitors a network or systems for malicious activity or policy violations. Already various techniques of artificial intelligence have been used for intrusion detection. The main challenge in this area is the running speed of the available implemen...


Machine Learning Models for Housing Prices Forecasting using Registration Data

This article has been compiled to identify the best model of housing price forecasting using machine learning methods with maximum accuracy and minimum error. Five important machine learning algorithms are used to predict housing prices, including Nearest Neighbor Regression Algorithm (KNNR), Support Vector Regression Algorithm (SVR), Random Forest Regression Algorithm (RFR), Extreme Gradient B...


An Empirical Comparison of Bayesian Network Parameter Learning Algorithms for Continuous Data Streams

We compare three approaches to learning numerical parameters of Bayesian networks from continuous data streams: (1) the EM algorithm applied to all data, (2) the EM algorithm applied to data increments, and (3) the online EM algorithm. Our results show that learning from all data at each step, whenever feasible, leads to the highest parameter accuracy and model classification accuracy. When fac...



Journal title:
  • Expert Syst. Appl.

Volume 65  Issue 

Pages  -

Publication date 2016